
 conductance value


OpenMENA: An Open-Source Memristor Interfacing and Compute Board for Neuromorphic Edge-AI Applications

arXiv.org Artificial Intelligence

Abstract: Memristive crossbars enable in-memory multiply-accumulate and local plasticity learning, offering a path to energy-efficient edge AI. To this end, we present OpenMENA (Open Memristor-in-Memory Accelerator), which, to our knowledge, is the first fully open memristor interfacing system integrating (i) a reproducible hardware interface for memristor crossbars with mixed-signal read-program-verify loops; (ii) a firmware-software stack with high-level APIs for inference and on-device learning; and (iii) a Voltage-Incremental Proportional-Integral (VIPI) method to program pre-trained weights into analog conductances, followed by chip-in-the-loop fine-tuning to mitigate device non-idealities. OpenMENA is validated on digit recognition, demonstrating the flow from weight transfer to on-device adaptation, and on a real-world robot obstacle-avoidance task, where the memristor-based model learns to map localization inputs to motor commands. OpenMENA is released as open source to democratize memristor-enabled edge-AI research. We release all hardware design and software material as open source at: https://tinyurl.com/mr592wuf
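As a rough illustration of what a read-program-verify loop driven by a proportional-integral law can look like, here is a minimal Python sketch. It is not the paper's VIPI method, whose details the abstract does not give. The device model (`SimulatedMemristor`, with a conductance that shifts linearly with pulse voltage), the function `program_conductance`, and all gains, limits, and tolerances are hypothetical values chosen only so the example runs.

```python
from dataclasses import dataclass


@dataclass
class SimulatedMemristor:
    """Toy device (assumption): conductance shifts linearly with pulse voltage."""
    g: float = 20e-6       # current conductance in siemens
    g_min: float = 5e-6    # lowest reachable conductance
    g_max: float = 100e-6  # highest reachable conductance
    gain: float = 4e-6     # siemens of change per volt of pulse amplitude

    def read(self) -> float:
        return self.g

    def pulse(self, volts: float) -> None:
        # Positive pulses raise the conductance, negative pulses lower it.
        self.g = min(self.g_max, max(self.g_min, self.g + self.gain * volts))


def program_conductance(device, target, kp=1.25e5, ki=1.25e4,
                        v_max=2.0, tol=0.5e-6, max_pulses=100):
    """Read, compare, pulse, repeat. Returns (converged, pulses_used)."""
    integral = 0.0
    for pulses in range(max_pulses):
        error = target - device.read()           # verify step
        if abs(error) <= tol:
            return True, pulses
        candidate = integral + error
        volts = kp * error + ki * candidate      # PI control law (hypothetical gains)
        if abs(volts) > v_max:
            # Respect the pulse amplitude limit and skip integration (anti-windup).
            volts = v_max if volts > 0 else -v_max
        else:
            integral = candidate
        device.pulse(volts)                      # program step
    return abs(target - device.read()) <= tol, max_pulses


device = SimulatedMemristor()
converged, pulses = program_conductance(device, target=60e-6)
print(converged, pulses, device.read())
```

Real devices respond nonlinearly and with cycle-to-cycle variation, so a loop like this would typically need device-specific gains and a noise-aware stopping rule.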


Synaptic metaplasticity with multi-level memristive devices

arXiv.org Artificial Intelligence

Deep learning has made remarkable progress in various tasks, surpassing human performance in some cases. However, one drawback of neural networks is catastrophic forgetting, where a network trained on one task forgets the solution when learning a new one. To address this issue, recent works have proposed solutions based on Binarized Neural Networks (BNNs) incorporating metaplasticity. In this work, we extend this solution to quantized neural networks (QNNs) and present a memristor-based hardware solution for implementing metaplasticity during both inference and training. We propose a hardware architecture that integrates quantized weights in memristor devices programmed in an analog multi-level fashion with a digital processing unit for high-precision metaplastic storage. We validated our approach using a combined software framework and memristor-based crossbar array for in-memory computing fabricated in 130 nm CMOS technology. Our experimental results show that a two-layer perceptron achieves 97% and 86% accuracy on consecutive training of MNIST and Fashion-MNIST, equal to the software baseline. This result demonstrates the proposed solution's immunity to catastrophic forgetting and its resilience to analog device imperfections. Moreover, our architecture is compatible with the limited endurance of memristors and has a 15x reduction in memory


Analog Neural Computing with Super-resolution Memristor Crossbars

arXiv.org Artificial Intelligence

Memristor crossbar arrays are used in a wide range of in-memory and neuromorphic computing applications. However, memristor devices suffer from non-idealities that cause variability in their conductive states, which makes programming them to a desired analog conductance value extremely difficult as the device ages. In theory, a memristor is a nonlinear programmable analog resistor with memory properties that can take infinitely many resistive states. In practice, such memristors are hard to make, and in a crossbar each device is confined to a limited set of stable conductance values. The number of conductance levels available for a node in the crossbar is defined as the crossbar's resolution. This paper presents a technique to improve the resolution by building a super-resolution memristor crossbar whose nodes contain multiple memristors to generate an r-simplicial sequence of unique conductance values. The wider the range and the larger the number of conductance values, the higher the crossbar's resolution. This is particularly useful in building analog neural network (ANN) layers, which have proven to be one of the go-to approaches for forming a neural network layer when implementing neuromorphic computations.
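The counting behind the resolution gain can be sketched in a few lines of Python. The sketch assumes the r memristors of a node are connected in parallel, so their conductances add; the abstract does not state the node topology. The level values (in microsiemens) and the function name `node_conductances` are hypothetical. With n levels per device and r devices per node, the number of unordered level combinations is C(n + r - 1, r), which is the r-simplicial number sequence, and the node reaches that many values only if every combination gives a different sum.

```python
from itertools import combinations_with_replacement
from math import comb


def node_conductances(levels, r):
    """Distinct total conductances of a node built from r parallel devices.

    Assumes parallel connection, so the node conductance is the sum of the
    individual device conductances.
    """
    return sorted({sum(combo) for combo in combinations_with_replacement(levels, r)})


# Hypothetical stable levels of one device, in microsiemens. The geometric
# spacing is chosen so that no two combinations share a sum for r up to 3.
levels_uS = [1, 3, 9, 27]

for r in (1, 2, 3):
    values = node_conductances(levels_uS, r)
    upper_bound = comb(len(levels_uS) + r - 1, r)
    print(f"r={r}: {len(values)} distinct values (upper bound {upper_bound})")
```

Evenly spaced levels such as 1, 2, 3, 4 would give many repeated sums, so the spacing of the device levels matters as much as the number of devices per node.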


Clustering of Big Data with Mixed Features

arXiv.org Machine Learning

Clustering large, mixed data is a central problem in data mining. Many approaches adopt the idea of k-means, and hence are sensitive to initialisation, detect only spherical clusters, and require a priori the unknown number of clusters. We here develop a new clustering algorithm for large data of mixed type, aiming at improving the applicability and efficiency of the peak-finding technique. The improvements are threefold: (1) the new algorithm is applicable to mixed data; (2) the algorithm is capable of detecting outliers and clusters of relatively lower density values; (3) the algorithm is competent at deciding the correct number of clusters. The computational complexity of the algorithm is greatly reduced by applying a fast k-nearest neighbors method and by scaling down to component sets. We present experimental results to verify that our algorithm works well in practice. Keywords: Clustering; Big Data; Mixed Attribute; Density Peaks; Nearest-Neighbor Graph; Conductance.


Leveraging Model Interpretability and Stability to increase Model Robustness

arXiv.org Machine Learning

State-of-the-art Deep Neural Networks (DNNs) can now achieve above-human-level accuracy on image classification tasks. However, their outstanding performance comes with a complex inference mechanism that makes them hard to interpret. To understand the underlying prediction rules of DNNs, Dhamdhere et al. propose an interpretability method that breaks down a DNN prediction score into the sum of its hidden unit contributions, in the form of a metric called conductance. Analyzing the conductances of DNN hidden units, we find a difference in how wrong and correct predictions are inferred. We identify distinguishable patterns of hidden unit activations for wrong and correct predictions. We then use an error detector in the form of a binary classifier on top of the DNN to automatically discriminate wrong and correct predictions of the DNN based on their hidden unit activations. Detected wrong predictions are discarded, increasing the model robustness. A different approach to distinguishing wrong and correct predictions of DNNs is proposed by Wang et al., whose method is based on the premise that input samples leading a DNN into making wrong predictions are less stable to DNN weight changes than correctly classified input samples. In our study, we compare both methods and find that combining them achieves better detection of wrong predictions.


Acceleration of Deep Neural Network Training with Resistive Cross-Point Devices

arXiv.org Machine Learning

In recent years, deep neural networks (DNNs) have demonstrated significant business impact in large-scale analysis and classification tasks such as speech recognition, visual object detection, and pattern extraction. Training of large DNNs, however, is universally considered a time-consuming and computationally intensive task that demands datacenter-scale computational resources recruited for many days. Here we propose a concept of resistive processing unit (RPU) devices that can potentially accelerate DNN training by orders of magnitude while using much less power. The proposed RPU device can store and update the weight values locally, thus minimizing data movement during training and allowing the locality and the parallelism of the training algorithm to be fully exploited. We identify the RPU device and system specifications for implementation of an accelerator chip for DNN training in a realistic CMOS-compatible technology. For large DNNs with about 1 billion weights, this massively parallel RPU architecture can achieve acceleration factors of 30,000X compared to state-of-the-art microprocessors while providing power efficiency of 84,000 GigaOps/s/W. Problems that currently require days of training on a datacenter-size cluster with thousands of machines can be addressed within hours on a single RPU accelerator. A system consisting of a cluster of RPU accelerators will be able to tackle Big Data problems with trillions of parameters that are impossible to address today, such as natural speech recognition and translation between all world languages, real-time analytics on large streams of business and scientific data, and integration and analysis of multimodal sensory data flows from a massive number of IoT (Internet of Things) sensors.
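The locality mentioned above can be illustrated with the three operations a weight array has to support during backpropagation: a forward product, a transposed product, and a rank-1 update in which each weight changes using only its own row and column signals. The plain-Python sketch below is a software analogy under that assumption, not the RPU circuit or its update scheme, which the abstract does not describe. The function names, the learning rate, and the toy data are hypothetical.

```python
def forward(w, x):
    """y = W x: each row accumulates its inputs, as a crossbar row would."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in w]


def backward(w, delta):
    """z = W^T delta: the same array read along the other axis."""
    return [sum(w[i][j] * delta[i] for i in range(len(w)))
            for j in range(len(w[0]))]


def local_update(w, x, delta, lr):
    """W += lr * delta x^T: weight (i, j) needs only delta[i] and x[j]."""
    for i, d_i in enumerate(delta):
        for j, x_j in enumerate(x):
            w[i][j] += lr * d_i * x_j


# Toy single-layer regression with hypothetical data.
weights = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
x = [1.0, 0.5, -1.0]
target = [1.0, -1.0]

for _ in range(50):
    y = forward(weights, x)
    delta = [t - y_i for t, y_i in zip(target, y)]  # output error
    local_update(weights, x, delta, lr=0.2)

print(forward(weights, x))      # approaches the target
print(backward(weights, [1.0, 0.0]))  # first row of W, read via the transpose
```

Because no weight ever leaves its array position, all three operations can in principle run in parallel across the array, which is the property the abstract credits for the speedup.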


Dopaminergic Neuromodulation Brings a Dynamical Plasticity to the Retina

Neural Information Processing Systems

The fovea of a mammal retina was simulated with its detailed biological properties to study the local preprocessing of images. The direct visual pathway (photoreceptors, bipolar and ganglion cells) and the horizontal units, as well as the D-amacrine cells were simulated.

